ABSTRACT

While diagnosis is generally considered a vital
element in reading clinicians' expertise, research has revealed that
even degreed, experienced reading clinicians display little personal
consistency or agreement with one another when diagnosing simulated
cases of reading difficulty. Three studies were conducted to
determine if systematizing the diagnostic process by providing a
process model, diagnostic decision aids, and sufficient practice with
feedback would result in more reliable diagnoses. Subjects were (1)
master's degree students in reading who had some prior course work in
diagnosis, (2) master's degree candidates with prior teaching
experience and coursework in reading, and (3) experienced classroom
teachers with little or no formal training in reading or reading
diagnosis. The results indicated that the training was successful,
both with degreed reading clinicians and with teachers who had no
previous work in reading diagnosis. (Appendixes contain the cue
inventory for one simulated case, a portion of a diagnostic decision
aid, a portion of a diagnostic checklist, and tables of data from the
studies.) (Author/FL)
Abstract

Diagnosis is generally considered a vital element in the expertise of
reading clinicians. Yet our previous research revealed that even degreed,
experienced reading clinicians displayed very low agreement with themselves
and with one another when diagnosing simulated cases of reading difficulty.
This paper reports the results of three studies designed to see if
systematizing the diagnostic process by providing (1) a process model,
(2) diagnostic decision aids, and (3) sufficient practice with feedback
would result in reliable diagnoses. The results indicate that the training
(which can easily be incorporated into typical courses in reading diagnosis)
was successful, both with degreed reading clinicians and with teachers who
had no previous course work in reading diagnosis.



IMPROVING DIAGNOSTIC RELIABILITY
IN READING THROUGH TRAINING

John F. Vinsonhaler, Annette B. Weinshank,
Ruth M. Polin, & Christian C. Wagner¹

A major reason for studying diagnosis of reading difficulties is the
importance accorded it by nearly all authorities in reading. Diagnosis as the
basis for remediation is an important principle in the literature and in
practice (Carter & McGinnis, 1970; Ekwall, 1976; Otto, McMenemy & Smith, 1973;
Rabinovitch, 1965; Smith, 1969; Smith, Carter & Dapper, 1970; Spache & Spache,
19*5).

At least three major orientations toward diagnostic content can be found
in the literature. Advocates of one approach establish general reading levels
compared to reading potential (Guszak, 1972; Spache, 1976). Advocates of a
second view emphasize performance on a set of reading skills. Advocates of a
third approach use diagnosis as a determination of causality, that is,
understanding the underlying factors that have caused reading problems. Such
an understanding supposedly enables the clinician to prescribe the most
appropriate steps for remediation (Carter & McGinnis, 1970; Harris, 1972;
Harris, 1977; Monroe, 1968; Natchez, 1968; Strang, 1964).

Regardless of the content of reading diagnosis, nearly all authors agree
that the diagnosis should form the basis for remediation. However, with few
exceptions (Bateman, 1971; Spache, 1969), authors have not dealt with the
effect of unreliable diagnosis on development of knowledge about treatment
outcomes.

Consider, for example, a study in which two remediations for a given
diagnostic category are being evaluated. If reading diagnosticians
demonstrate low reliability, identifying which type of problem a student has
is essentially a random choice. Assume that one of these remediations is
effective. This effective treatment will improve performance, but only for
those students who happen to have been diagnosed correctly. Overall, to the
degree that the diagnoses are unreliable, the efficacy of a differentially
effective treatment will be systematically underestimated. Furthermore,
reliability of diagnosis does not necessarily inform validity (one can be
reliably wrong). Reliability does, however, permit the correct estimation of
remedial effectiveness (Collen, Rubin, Neyman, Dantzig, Baer, & Siegelaub,
1964).
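
The underestimation argument can be made concrete with a small calculation.
The following is a minimal sketch, not an analysis taken from the studies
cited. It assumes a hypothetical treatment that produces a fixed gain only for
children who truly have the targeted problem, and it uses diagnostic accuracy
(sensitivity and specificity) as a simple stand-in for the quality of
diagnosis. The function name and every number are illustrative assumptions.

```python
def observed_gain(true_gain, prevalence, sensitivity, specificity):
    """Mean gain among children *diagnosed* with the targeted problem.

    Assumes (hypothetically) that the treatment yields `true_gain` for
    children who really have the problem and no gain for anyone else.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    share_correct = true_positives / (true_positives + false_positives)
    return true_gain * share_correct


# Perfectly accurate diagnosis: the full effect is observed.
perfect = observed_gain(10.0, prevalence=0.5, sensitivity=1.0, specificity=1.0)

# Diagnosis no better than a coin flip: the observed effect is diluted.
random_choice = observed_gain(10.0, prevalence=0.5, sensitivity=0.5, specificity=0.5)
```

Under these assumptions the treated group contains as many misdiagnosed as
correctly diagnosed children when diagnosis is a coin flip, so the measured
gain is half the true gain.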

Empirical Studies of Diagnosis

There are conflicting reports in the medical literature on the agreement
among individual physicians on medical judgments. Several studies indicate
substantial agreement among physicians; others show marked disagreement
(Cochrane & Garland, 1952; Fletcher, 1952; Garland, 1959; Paton, 1957;
Yerushalmy, 1955, 1969). For example, Lerner and Schuyler (1973) suggest that
groups of clinicians, working together, can produce diagnostic statements that
are mutually agreed upon. Educational clinicians working alone, however,
yield less promising results.

In a series of observational studies, we analyzed the written diagnoses
and remedial plans of reading specialists and special-education clinicians to
determine commonality (group agreement) and individual agreement about
simulated cases (see Vinsonhaler, Weinshank, Wagner, & Polin, 1983). The
initial study revealed very low agreement among specialists and in individual
diagnoses. This finding was startling, considering that the subjects were
experienced, highly regarded reading clinicians. We performed a series of
five additional observational studies to see if these unexpected findings
could be replicated and generalized. We drew new samples from additional
populations, including other reading specialists, classroom teachers, and
learning disabilities clinicians. In addition, we developed and used new
simulated cases and case formats. Potential errors that could result from the
translation of written diagnoses to standardized categories were eliminated
through use of a standardized diagnostic checklist. Finally, we investigated
the reliability of diagnostic categories that were linked to suggested
remediations and of the remediations themselves (Weinshank, 1982). Individual
diagnostic and remedial reliability remained very low across all the studies
(i.e., clinicians very frequently disagreed about what the problems were and
how to remediate them). Mean interclinician reliability averaged 0.03 (Phi)
and 0.08 (Porter). Mean intraclinician reliability averaged 0.21 (Phi) and
0.20 (Porter). The initial findings on commonality were also confirmed. Mean
commonality across the studies was only fractionally higher than the minimum
possible value.
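
To show what a Phi value of this size means, the sketch below computes the
standard phi coefficient between two clinicians' checklists, each coded as a
list of 0/1 marks, and averages it over all pairs of clinicians. It is a
minimal illustration under stated assumptions: the data are hypothetical, the
zero-denominator rule is our own choice, and the exact computations used in
the studies (including the Porter statistic) are defined in the paper's
appendix and are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def phi(a, b):
    """Phi coefficient between two equal-length lists of 0/1 marks."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # assumption: treat an undefined coefficient as zero
    return (n11 * n00 - n10 * n01) / denom


def mean_pairwise_phi(diagnoses):
    """Average phi over every pair of clinicians who diagnosed the same case."""
    pairs = list(combinations(diagnoses, 2))
    return sum(phi(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three clinicians, six checklist categories.
clinicians = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1],
]
mean_phi = mean_pairwise_phi(clinicians)
```

A value near zero, such as the 0.03 reported above, indicates that knowing
which categories one clinician marked says almost nothing about which
categories another clinician marked for the same case.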

These studies show that, as a group, education professionals, including
reading specialists, produce diagnoses, the content of which shows in
aggregate some signs of conforming to the recommendations found in the
literature. That is, diagnoses usually included statements about reading
potential, strengths and weaknesses in skills, and suspected causal factors
(hearing, vision, and attitude).



Individually, however, the diagnoses show significant deviations from the
recommendations in the literature. First, they include a large number of
one-time-only statements of questionable relevance to remediation. Second,
they systematically fail to mention the reading skills of greatest import to
remediation. Third, even when important skills are mentioned, these statements
are not reliably linked with treatment prescriptions (Weinshank, 1982).

One explanation for this unreliability might be that these studies used
simulated cases in an experimental environment. However, the use of actual
children in a natural setting might further decrease agreement, since a
child's performance would be expected to change, thereby introducing
unreliability in the data base.

The differential effects of using real and simulated cases have been
studied in medicine. No differences were found when the diagnoses were
compared for (1) people with real medical problems and (2) people coached to
simulate the same medical problems (human simulation). Further, in studies
comparing human simulation of medical problems with simulated cases whose
format was similar to those used in our studies, differences were found in
procedure, but not in the final diagnoses (Norman & Tugwell, 1981).

We favor a second explanation for the low diagnostic agreement found in
our studies: Reading specialists receive inadequate training. A comparison
of training programs in medicine and reading is instructive here. Medical
training is based on (1) an organized body of empirically based knowledge that
relates specific remedies to specific problems (Copp, 1976; Johnson, 1975;
King, 1976; Puck, 1976; Roos, 1975); (2) systematic techniques governing the
collection of cues (DeDombal, Leaper, Horrocks, Staniland, & McCann, 1974;
Elstein, Shulman, & Sprafka, 1978; Prior, Silberstein, & Stang, 1981); and (3)
perhaps most importantly, the supervised diagnosis, treatment, and follow-up
of thousands of cases (Shapiro & Lowenstein, 1979; Simpson, 1972). By
contrast, training in reading diagnosis and remediation is based on (1)
non-empirically verified theoretical concepts, (2) idiosyncratic cue
collection techniques, and (3) supervised diagnosis, remediation, and
follow-up on few cases.

Diagnostic Training Hypothesis

Here we report the results of three studies investigating the diagnostic
training hypothesis that improved clinical training can increase diagnostic
reliability.

A Theory of Clinical Problem Solving 

The problem-solving behavior of clinicians in medical and other
professions has led to a theory of how diagnostic and treatment decisions
should be made (DeGowin & DeGowin, 1976) and to observation of how they
actually seem to be made (Bordage, 1982; Elstein, Shulman, & Sprafka, 1978).
According to the theory, there are two participants in the clinical
problem-solving setting. The first is any complex system (whether an
interaction between a case and a clinician or an individual and a clinician)
referred to as a case. The proper functioning of the case is inferred from
its performance on certain critical variables.

The second participant is a problem solver, the clinician. The clinician
maintains cases and tries to improve the case's (a human's) performance. The
interaction between clinician and case is usually initiated by a problem with
case performance (DeGowin & DeGowin, 1976). The actions taken by the
clinician have been organized around the terms "diagnosis" and "treatment" and
are all logically based upon the clinician's model of process (i.e., how
critical performances and causal factors are related).



The principal explanatory device in the empirical theory is clinical 
memory. Memory consists of (1) associations between cues (case information) 
and potential problems and (2) associations between problems and treatments. 
Decisions are driven by hypothesis testing (i.e., the generation of a set of 
likely problems and the collection of cues to rule in or rule out the hypoth- 
esized problems). The principal means of validating this theory is artificial 
intelligence (e.g., computer generated diagnoses) and computer-simulation ex- 
periments, which predict the problem-solving behavior of real clinicians. The 
behavior predicted is the making of diagnostic judgments, given case
information. Such studies in reading (Gil, Wagner & Vinsonhaler, 1978;
Wagner, 1982) show that the theory predicts the reliable portions of reading
clinicians' behavior.
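
The two kinds of association in clinical memory, and the hypothesis-testing
cycle that uses them, can be sketched as a pair of lookup tables and a short
routine. This is a minimal illustration only, not the simulation programs
cited above. Every cue, problem, and treatment name is hypothetical, and the
rule that a problem is ruled in only when all of its associated cues are
present is an assumption made to keep the example small.

```python
# Hypothetical clinical memory; none of these entries come from the studies.
CUE_TO_PROBLEMS = {
    "low sight-word score": ["instant word recognition"],
    "many oral reading errors": ["instant word recognition", "decoded word recognition"],
    "low nonsense-word score": ["decoded word recognition"],
}
PROBLEM_TO_TREATMENTS = {
    "instant word recognition": ["repeated reading practice"],
    "decoded word recognition": ["sound-symbol instruction"],
}


def generate_hypotheses(initial_cues):
    """Problems suggested by at least one of the cues seen so far."""
    hypotheses = set()
    for cue in initial_cues:
        hypotheses.update(CUE_TO_PROBLEMS.get(cue, []))
    return hypotheses


def check_hypotheses(hypotheses, case_file):
    """Rule a problem in only if every cue associated with it is present."""
    ruled_in = []
    for problem in sorted(hypotheses):
        relevant = [cue for cue, problems in CUE_TO_PROBLEMS.items() if problem in problems]
        if all(case_file.get(cue, False) for cue in relevant):
            ruled_in.append(problem)
    return ruled_in


def diagnose(initial_cues, case_file):
    """Return (problem, treatments) pairs for every problem that was ruled in."""
    ruled_in = check_hypotheses(generate_hypotheses(initial_cues), case_file)
    return [(problem, PROBLEM_TO_TREATMENTS[problem]) for problem in ruled_in]


# Hypothetical case: which cues turn out to be present when collected.
case_file = {
    "low sight-word score": True,
    "many oral reading errors": True,
    "low nonsense-word score": False,
}
result = diagnose(["many oral reading errors"], case_file)
```

Here the initial cue suggests two problems. Collecting the remaining cues
rules in the first and rules out the second, and the second table then
supplies the treatment.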

Because the heart of the theory is clinical memory and clinical memory is
dependent on a model of process, it follows that a model of the reading
process must form the basis for the design of a training program in reading.

A Model of Reading 

The model chosen to guide the training studies described here is the
Model of Reading and Learning to Read (MORAL) first developed by George
Sherman of Michigan State University and subsequently expanded and adapted for
these studies (Cureton, Stewart, & Patriarca, 1980; Weinshank, Cureton, &
Blatt, 1980). The model describes a series of critical performances that a
skilled reader must demonstrate, together with the concurrent cognitive
skills, personal and environmental factors, learning history, and learning
skills that would enable and sustain the critical performances.

In this training model the reader (1) receives input from the
environment; (2) processes this input in conjunction with his/her own memory
of past events; and (3) produces an output that affects his/her memory, the
environment, or both. A particular reader, for example, attempting a
particular reading task, receives as input the requirements of the task. This
input, together with past knowledge of reading and language, is processed in
some way, and outputs are produced. Some effects are not observable (e.g.,
changes in memory) and some are (e.g., performance on the reading task as
measured in some way).

In our training studies, we found seven reading and language performances
critical to effective reading. To the degree that these performances are
inadequate, mastery of some reading tasks may be impeded.

1. instant word recognition performance, defined as the ability to
recognize a certain set of words instantly

2. decoded word recognition, defined as the ability to recognize a
set of words using various association strategies (e.g., sound-symbol
association)

3. vocabulary, defined as the ability to give word meanings

4. oral reading, defined as the ability to read text aloud with
appropriate phrasing, fluency, and intonation

5. silent reading comprehension, defined as the ability to answer
specific questions on text read silently

6. listening comprehension, defined as the ability to answer
specific questions on text read aloud by someone else

7. attention/motivation, defined as the ability to activate and
maintain concentration on the task at hand

The MORAL goes further than specification of the critical performances.
For each critical performance this model specifies the associated causal
factors (i.e., the child and environmental factors that affect his or her
performance). For example, if the child has poor instant word recognition, the
MORAL suggests investigating probable causal factors such as poor visual
discrimination, insufficient reading practice, and so on.
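
One way to picture this part of the model is as a lookup from each critical
performance to the causal factors worth investigating. The sketch below is an
illustration under stated assumptions, not the MORAL itself: it covers only
three performances, the factor lists follow our reading of the examples in
Table 1, and the function name is hypothetical.

```python
# Causal factors per critical performance. Entries follow the examples
# listed in Table 1; the pairing is a reading of that table, not an
# authoritative copy of the MORAL.
CAUSAL_FACTORS = {
    "instant word recognition": [
        "visual discrimination of words",
        "visual memory of words",
        "decoded word recognition skills",
    ],
    "decoded word recognition": [
        "auditory memory and discrimination",
        "segmentation/blending",
        "use of context",
    ],
    "word comprehension": ["word knowledge", "verbal concepts"],
}


def factors_to_investigate(inadequate_performances):
    """List the causal factors to check for each inadequate performance."""
    return {
        performance: CAUSAL_FACTORS.get(performance, [])
        for performance in inadequate_performances
    }


plan = factors_to_investigate(["instant word recognition"])
```

A clinician working from such a table starts from the performance that is
inadequate and works outward to its likely causes, rather than collecting
cues in an idiosyncratic order.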

Requirements of Effective Diagnostic Training

Three features characterized the effective diagnostic training used in
these studies. First, instruction must provide training on a model of the
reading process to serve as the foundation for the organization of clinical
memory. Second, instruction should include training with decision aids to
ensure systematic data collection and diagnostic decision making. Finally,
practice with feedback is necessary to consolidate clinical memory and
strategy.

The MORAL provided a training process for our studies. Clinicians were
taught to use the MORAL to (1) identify the most important reading
performances and (2) infer significant underlying causes of those
performances. For the studies reported here, we developed decision and
training aids by examining the most likely causes for all of the critical
reading performances. From these we devised lists of inferences (see Table 1).
Two major categories of aids were created: diagnostic/remedial forms for use
during diagnostic decision making and diagnostic checklists for translating
written diagnoses into a common vocabulary. We also provided extended practice
with feedback on decisions. A senior clinician, operating in accordance with
the model, evaluated study participants' diagnoses of several cases.

Elaboration and refinement of these training elements occurred over the
course of the three studies reported here.

The Initial Training Study (1977)

The purpose of the initial training study was to investigate the effects
of non-model based training on the participants' agreement with a criterial
diagnosis. Specifically investigated was the impact of non-model based
decision aids and practice. The participants were master's degree students²
in reading who had already taken some prior course work. Clinical training
consisted of 30 hours of instruction in a five-week class format. There were
three groups for which the treatments differed. One group used real cases,
the second used simulated cases, and the third used simulated cases with
decision aids (diagnostic flow charts).

²To avoid confusion, participants in the study will be referred to as
participants or students. A student diagnosed to have a reading problem will
be referred to as a child.

Table 1

Critical Reading Performances and Examples of Causal Factors

Critical Reading Performance    Examples of Causal Factors

Instant Word Recognition        Visual discrimination of words;
                                Visual memory of words;
                                Decoded word recognition skills

Decoded Word Recognition        Auditory memory and discrimination;
                                Segmentation/blending;
                                Use of context

Word Comprehension              Word knowledge;
                                Verbal concepts

Reading Comprehension           Instant word recognition;
                                Decoded word recognition;
                                Word comprehension;
                                Processing strategies

Listening Comprehension         Text comprehension frames and strategies

Oral Reading                    Instant word recognition;
                                Decoded word recognition;
                                Word comprehension

Attention/Motivation            Amount and condition of effective practice;
                                Attention of the learner;
                                Relevance (transferability) of practice task;
                                Learner's correct perception of the task;
                                Corrective feedback

Materials

The stimulus materials used for testing and for training were four
different simulated cases of reading difficulty which had been used in the
observational studies of reading specialists described above (Vinsonhaler
et al., 1983). Each case was based on data from a child who had attended the
Michigan State University Reading Clinic. The four simulated cases were
representative of reading problems commonly encountered in public schools.
Grade levels in the cases ranged from third to seventh.

A variety of problems were covered, including: depressed sight vocabu- 
lary, inadequate oral reading fluency, problems with application of decoding 
skills and with decoding of multisyllabic words, high frequency hearing loss, 
and comprehension problems involving the demands of content-related materials.

All cases included an audiotaped interview with the child and a brief
statement of the reason for referral to the clinic (typically, below grade
placement performance in reading-related subjects). The rest of each
simulated case consisted of all the information (cues) that had been collected
during testing sessions with that child. At the time the cases were
developed, the Reading Clinic was choosing from among a variety of formal and
informal measures to collect information about the children's home, school,
and physical background; cognitive ability; academic achievement; and
individual reading performance. The items of information collected for each
case (completed forms, test scores, test booklets, examiner's comments, and
audiotapes) were stored in a portable file box. A cue inventory listing all
the information available was provided for each case. The cue inventory for a
simulated case (Case 4: Dan) is shown in Appendix A.

Each simulated case had an equivalent form, a superficially disguised
replicate of the original prepared by changing the child's name, using
alternate forms of tests, and so on (Lee & Weinshank, 1978). Thus, there were
four original cases and four replicates.

Design

The design involved pre- and posttesting on a randomly assigned simulated
case. There was no control group because diagnostic agreement was known to be
stable at a very low level. The dependent variable was agreement with a
diagnosis prepared by three senior clinicians working as a group.

The criterion diagnosis was a set of weights assigned to each stated
diagnostic category. Higher weights were assigned to categories judged
important by the group. The participant's score was the sum of the weights for
the categories mentioned in the participant's diagnosis of the child divided
by the sum of the weights on the categories in the criterial diagnosis. For
example, suppose the criterial diagnosis included sight words (with a weight
of 1.0) and poor oral reading (with a weight of 0.5). If a participant's
diagnosis included sight words and poor comprehension, the participant's score
would be 1.0 divided by the sum of the clinicians' weights (1.0 plus 0.5), or
about 0.67.
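
The scoring rule can be written out in a few lines. The sketch below is a
minimal illustration that reproduces the worked example in the text; the
function and variable names are our own, and the category labels are those of
the example rather than of the actual criterial diagnoses.

```python
def agreement_score(criterial_weights, mentioned_categories):
    """Weighted share of the criterial diagnosis covered by a written diagnosis."""
    total = sum(criterial_weights.values())
    matched = sum(
        weight
        for category, weight in criterial_weights.items()
        if category in mentioned_categories
    )
    return matched / total


# Worked example from the text: criterial weights of 1.0 and 0.5.
criterial = {"sight words": 1.0, "poor oral reading": 0.5}
score = agreement_score(criterial, {"sight words", "poor comprehension"})
```

Categories that appear in a written diagnosis but not in the criterial
diagnosis (here, poor comprehension) neither add to nor subtract from the
score, so the measure rewards coverage of the criterial categories and does
not penalize extra statements.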

Results

Training substantially improved diagnostic agreement. Mean pretest
agreement with the criterial diagnosis was 0.16, while mean posttest
agreement was 0.46. There were no marked differences among the treatment
groups.
Thus, this study confirmed the training benefits of systematizing
information collection and using diagnostic decision aids to reach diagnostic
judgments. In addition, the study served as a first approximation for the
training and measurement methods used in subsequent studies.

The Second Training Study (1979)

The second study attempted to determine if further improvement in
agreement (beyond that produced by the decision aids) would result from
model-based training. Diagnostic agreement was measured by correlations
between diagnoses of study participants for the same case, rather than
agreement with a target diagnosis. We instructed participants in how to use a
model of reading with four (instead of seven) critical reading performance
criteria. Decision aids and practice with instructor feedback were both based
on the Model of Reading and Learning to Read.

Methods

Twenty-eight experienced teachers (master's degree candidates with prior
coursework in reading) received 30 hours of training in a five-week graduate
course on reading diagnosis.

Instruction

The participants received instruction based on the Model of Reading and
Learning to Read (MORAL). They practiced what they had learned on four
simulated cases of reading difficulty, made diagnostic decisions about the
cases, wrote them up, and received the instructor's comments about their
written diagnoses. These training cases were different from the ones they
diagnosed in pre- and posttests. We gave the student-participants a decision
aid consisting of diagnostic/prescriptive summary forms to guide their
interactions with the simulated cases.
On the first day of class, we randomly assigned students to one of four
groups (seven students in each group; N = 28). The four groups received
identical training (e.g., the MORAL and practice on simulated cases), but were
tested with different simulated cases. (These four simulated cases and their
replicates were the same as those used in the initial training study.)
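
For readers who wish to replicate the design, random assignment into
equal-sized groups can be sketched as follows. This is an illustration only;
the text does not describe the procedure actually used, and the participant
identifiers and the seed are hypothetical.

```python
import random


def assign_to_groups(participants, n_groups, seed=None):
    """Shuffle participants and deal them into equal-sized groups."""
    if len(participants) % n_groups != 0:
        raise ValueError("participants cannot be split into equal groups")
    shuffled = list(participants)
    random.Random(seed).shuffle(shuffled)
    # Deal like cards: group i takes every n_groups-th participant from i.
    return [shuffled[i::n_groups] for i in range(n_groups)]


students = [f"student_{i:02d}" for i in range(1, 29)]  # hypothetical IDs
groups = assign_to_groups(students, n_groups=4, seed=1979)
```

Fixing the seed makes the assignment reproducible; each of the four groups
would then be tested on a different simulated case.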

We conducted classroom instruction in three-hour blocks, twice weekly,
for five weeks. The course topics and their order of presentation were
governed by the MORAL. A handout containing the MORAL in matrix form was
distributed on the first day. We gave various demonstrations during and/or
following lectures. These included the administration, scoring, and
interpretation of measures used to assess the critical performances of
reading. A simulated case, like those used for the pretest, served as the
basis for many of the demonstrations.

Students were required to work on one practice case each week (four in
all). All practice cases were computer-based, but the format of the case
information was the same as in the manually based cases used for pre- and
posttesting. Students completed data-base and causality checklists for each
case and then received feedback and had the opportunity to discuss the case
during formal class time. Although the instructor gave feedback to the total
class, the instructor did not examine the diagnoses made by individual
students. Hence, there was no assurance that the student and instructor
examined all critical performances and likely causal factors in every case.

Testing

The students received complete directions for diagnosing the simulated
cases on their pre- and posttests. They read and simultaneously listened to
recorded instructions about how to use a simulated case. To check their
understanding, they were to request information from a practice case different
from the one they would subsequently diagnose. After instructions, the
students received initial contact information about the case to be diagnosed;
this included a short summary about the child's reading performance (e.g.,
"The child is 10 years old and reading at third-grade level."). The students
then had 45 minutes to collect as many cues (items of information) about the
case as they wished. The instructor asked them to list on a record form, in
order of collection, all cues they selected.

After 45 minutes, they wrote their diagnoses using the categories on a
diagnostic/prescriptive summary form. This form channelled their thinking and
diagnostic write-up toward four of the critical performances (instant word
recognition, decoded word recognition, oral reading, silent reading
comprehension) and their causal factors.

Then, students were asked to match their written diagnoses with
diagnostic categories listed on two different checklists. This procedure
standardized the student-participants' vocabulary for comparison and data
analysis. Requiring them to write their diagnoses before completing the
checklists was suggested by results of prior work showing that participants
given the checklist immediately tended to check off all items whether or not
they characterized the case.

The MORAL data-base checklist listed 49 statements about a child's
reading status. The students were to mark only those categories they had
mentioned in their diagnostic write-up. For example, students who stated that
a child lacked visual memory were to check Category 7, "inadequate visual
memory of word forms."

The same procedures applied for the second checklist, the MORAL causality
checklist, which included 25 statements indicating various causes of poor
results on the four critical reading performances. For example, students who
mentioned that the child had inadequate instant word recognition because s/he
lacked practice should have checked the first statement on the causality
checklist, "Inadequate instant word recognition is partially caused by
insufficient independent reading practice."

Identical procedures took place in the posttest session on the last class
day. Students received the same case they had diagnosed in their pretest.
For a complete description of procedures and materials used, see Gil, Polin,
Vinsonhaler & VanRoekel (1980).
Data Analysis and Results

Data analysis focused on (1) the extent of group agreement (commonality)
and (2) the percentage of diagnostic agreement among students (interclinician
agreement). Agreement statistics were calculated separately for the data-base
and causality checklists, and means are reported for all participants
diagnosing the same case. Most participants marked almost all categories on
the checklists, despite instructions to mark only those items they wrote in
their own diagnoses. Therefore, all categories in the checklist that had not
appeared in the students' initial diagnostic write-ups were discarded before
analysis of diagnostic agreement. The reliability of this verification
procedure was checked by repeating it on a random sample of the diagnostic
checklists. In 85% of the decisions to discard checklist items, both coders
agreed. Data analyses then were run on all checklist categories that coders
verified had been included in the students' written diagnoses.
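
The verification step and the coder-agreement check can be sketched as
follows. This is a minimal illustration under stated assumptions: checklist
categories are represented by their numbers, simple percent agreement is used
because the text reports a percentage, and all data shown are hypothetical.

```python
def verified_marks(checklist_marks, written_categories):
    """Keep only the checklist categories that also appear in the write-up."""
    return checklist_marks & written_categories


def percent_agreement(coder_a, coder_b):
    """Share of keep/discard decisions on which two coders agree."""
    if len(coder_a) != len(coder_b):
        raise ValueError("coders must judge the same items")
    same = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return same / len(coder_a)


# Hypothetical data: category numbers marked vs. actually written about.
marked = {1, 4, 7, 12, 30}
written = {4, 7}
kept = verified_marks(marked, written)

# Hypothetical discard decisions (True = discard) by two coders on five items.
agreement = percent_agreement(
    [True, True, False, True, False],
    [True, True, False, False, False],
)
```

In this hypothetical example three of the five marked categories are
discarded, and the two coders agree on four of five decisions (0.8).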
Diagnostic Commonality

Commonality results for the pretest were higher than in the observational
studies (0.28 for the data-base checklist and 0.26 for the causality checklist
versus 0.20 for the observational studies). This improved group agreement
probably resulted from the use of decision aids based on a model of reading,
in this case, the MORAL. In addition, mean commonality increased between the
pre- and posttests (from 0.28 to 0.36 for the data-base checklist and from
0.26 to 0.44 for the causality checklist). This change reflects improved
group agreement resulting from model-based training with feedback. As in the
previous observational studies, the commonality results again show the
pervasiveness of the seven critical performances as commonly used diagnostic
categories.

Interclinician Diagnostic Agreement

The mean interclinician correlations in Table 2 show that individual
diagnostic agreement on the pretest was higher in the training study than in
the observational studies. For example, for the data-base checklist, which
includes judgments on the four critical reading performances, the mean initial
agreements were 0.26 (Phi)³ and 0.17 (Porter), notably higher than the 0.03
(Phi) and 0.08 (Porter) obtained for the total diagnosis in the observational
studies. The higher mean initial agreement on the causality checklist is
probably due to the use of the model-based decision aids; all other conditions
were identical to those common to the observational studies.

Students' diagnostic agreement improved from pre- to posttests on both
checklists. Thus, model-based clinical training with feedback was effective
in improving individual diagnostic agreement beyond that produced by the
decision aids. Finally, the data show that a greater improvement was obtained
on the causality checklist than on the data-base checklist. Students began
the course with higher Phi correlations on the data-base checklist and
improved less on the data-base checklist than they did on the causality
checklist.

³Explanations of the Phi correlation and the Porter statistic are found in
Appendix B.

Third Training Study (1980)

The purpose of the final training study was to evaluate the improvement
in diagnostic agreement that would result when the model-based training was
more tightly controlled. The model used, the MORAL, included all seven of the
critical reading performances. Classroom instruction in the model was based
on a text developed expressly for training (Weinshank et al., 1980). The
model-based decision aids were redesigned such that students were forced to
(1) make a yes or no decision on the status of each critical reading
performance, (2) support that decision with case data, and (3) list probable
causes underlying performance (see Appendix C). Model-based practice was given
with feedback specific to each student.

Methods

The 15 participants, experienced classroom teachers with little or no
formal training in reading or reading diagnosis, were chosen so that we might
determine the effectiveness of this type of training with non-specialists.

The student-participants were divided into three training groups, each
with a different preceptor (i.e., an experienced clinician who diagnoses and
remediates according to a model of process and provides feedback on student
decision making for specific cases). The three groups were instructed for 30
hours and given 10 hours of extra practice time in the use of (1) the MORAL;
(2) simulated and/or real cases with instructor feedback; and (3) decision
aids that guided the interaction of simulated case users. Progress was
monitored by means of pre-, mid-, and posttests on a simulated case, and an
additional posttest (transfer test) on a case not previously diagnosed. Five
simulated cases were used; one student from each preceptor training group was
tested on each case.

Table 2

Inter-Clinician Correlations

                        Data-Base Checklist                  Causality Checklist
Case        Statistic   Pretest           Posttest           Pretest           Posttest

1           Phi         .32(.13)a         .39(.13)           .16(.27)          .38(.18)
            Porter      .22(.09)          .28(.11)           .17(.16)          .39(.11)

2           Phi         .23(.18)          .47(.12)           [illegible](.25)  .37(.18)
            Porter      [illegible]       [illegible]        [illegible]       [illegible]

3           Phi         .15(.17)          .31(.16)           .19(.23)          .37(.24)
            Porter      .09(.10)          .23(.13)           .16(.12)          .34(.18)

4           Phi         .33(.15)          .36(.12)           [illegible](.26)  .39(.20)
            Porter      .22(.09)          .26(.10)           [illegible](.18)  .37(.16)

Grand mean  Phi         .26(.08)          .38(.07)           .16(.02)          .38(.01)
            Porter      .17(.06)          .28(.05)           .15(.02)          .36(.03)

aStandard deviations appear in parentheses.

The materials used in this study included the same set of four simulated
cases and their replicates used in the previous training studies. In
addition, a new simulated case and replicate were developed to provide an
example of a reading comprehension problem in an older child.

For two of the groups, the formal classroom instruction in reading
diagnosis was conducted in weekly three-hour blocks with additional time spent
outside the class diagnosing computer-based simulated cases (as opposed to the
manually-based ones used for the test sessions). After examining a simulated
case, students filled out the decision-aid diagnosis sheets. Then they
translated their diagnoses to a standardized checklist, indicating whether the
case showed adequacies or inadequacies in the seven critical reading
performances and their causal factors as postulated by the MORAL. Students in
the remaining group, who used real, not simulated, cases did not use the
checklist. Instead, their preceptor analyzed the real cases diagnosed by each
student in class.

Testing 

The testing procedure replicated that used in the second training study
except that

1. there were five simulated cases rather than four;

2. there were four testing sessions (pre-, mid-, posttest, and
transfer of training) rather than just pre- and posttests; and

3. a revised, model-based, diagnostic decision aid (discussed above)
and checklist were developed using the MORAL (with its seven
critical reading performances rather than four).

A portion of the decision aid is shown in Appendix C.

The decision aid forces the individual to

1. make a judgment about the adequacy or inadequacy of each
critical reading performance,

2. indicate the case information used in making the decision,

3. list likely causal factors underlying performance, and

4. suggest remedial strategies.

This decision aid was based on the problem-oriented medical record
developed by Weed (1976).
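
The four steps the decision aid requires can be pictured as a simple record structure. The sketch below is only an illustration: the class names, field names, and the completeness rule are assumptions introduced here, not part of the decision aid itself. The sample values are taken from the completed form for the case of Dan shown in Appendix C.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CausalFactor:
    """One contributing factor and its suggested remediation (steps 3 and 4)."""
    description: str
    remedial_strategy: str


@dataclass
class DecisionAidEntry:
    """One critical reading performance on the decision aid (hypothetical layout)."""
    performance: str                                   # e.g. "Instant Word Recognition"
    has_problem: bool                                  # step 1: yes/no judgment
    supporting_data: List[str] = field(default_factory=list)          # step 2
    causal_factors: List[CausalFactor] = field(default_factory=list)  # steps 3-4

    def is_complete(self) -> bool:
        """Assumed rule: every judgment needs case data; a 'yes' also needs
        at least one causal factor, each with a remedial strategy."""
        if not self.supporting_data:
            return False
        if not self.has_problem:
            return True
        return bool(self.causal_factors) and all(
            f.description and f.remedial_strategy for f in self.causal_factors
        )


# Values drawn from the sample form in Appendix C.
dan_entry = DecisionAidEntry(
    performance="Instant Word Recognition",
    has_problem=True,
    supporting_data=["SORT Score: 2.1"],
    causal_factors=[
        CausalFactor(
            description="Poor visual memory of words",
            remedial_strategy="Look at the whole word, not just the beginning letters",
        )
    ],
)
```

Under the assumed rule, an entry that records a problem but lists no causal factor would be flagged as incomplete, which mirrors the way the form forces each decision to be supported.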

The MORAL required participants to say whether the child in each case
performed adequately or inadequately on each category of critical reading
performance. Subsets of diagnostic categories under each critical reading
performance category included related causal factors. Under each critical
performance, an "other" category accommodated those diagnostic statements by
students that could not be translated into existing categories. In addition,
some causal factors related to learning were listed separately at the end of
the checklist. A portion of that checklist is shown in Appendix D.

The pretest was administered prior to any group meetings. Identical
procedures were followed for the midtest (approximately five weeks later) and
the posttest (at the end of 10 weeks). On each test, students diagnosed the
same case; thus, a progress profile was established. A week after the first
posttest, a second posttest (the transfer test) was given in which participants
diagnosed a different simulated case, one they had never seen. (For a
complete description of procedures and materials used, see Polin, 1981.)

Results 

Observations of Instructional Activities

Activities during all sessions with the three preceptors were recorded
continuously, with times noted at approximately 10-minute intervals. The
recorded observations were coded into three descriptive categories: (1) type
of interaction, (2) topics covered, and (3) sources of topics. (Study
participants will be referred to as students since they were involved in the
studies as students in classes.)

Table 3 is an excerpt of an observation protocol and its translation to
the coding sheet.
Table 3

Excerpt of an Observation Protocol and Its Translation to the Coding Sheet

Observation                                   Coding classification

Preceptor: "What is [illegible] if I give     Preceptor questioning,
[illegible] in order?"                        [illegible].

Preceptor: "Follow an analytical course.      Preceptor informing,
Use as few tests as possible.                 students listening.
If instant word recognition                   [illegible]: instant word recognition;
is a problem, go to DOLCH                     [illegible] cue collection.
or some such instant word
recognition test."

Table 4 shows the types of interactions preceptors and students engaged
in during the 10 training sessions and what proportion of 10-minute segments
from each of 10 sessions were spent in each type of interaction.

In addition, the table shows the proportion of 10-minute blocks in which
each topic was observed. As can be seen, a great deal of the time was spent
discussing critical reading performances.

In summary, preceptor training in this study is characterized by
lecturing and question answering on a common set of topics consisting mainly
of the critical reading performances. Preceptors differed in the sources they
preferred to use for discussing the topics. One relies heavily on personal
experiences, standardized tests, and real cases as springboards for
discussion; another prefers more formal sources: written materials and
simulated cases.

Table 4

                                        Preceptor
                                  1            2            3

Interaction
  [illegible]                     [illegible]  .00          [illegible]
  P questions, S answers          [illegible]  .74          .38
  P questions, P answers          [illegible]  .11          .00
  S talks, P listens              [illegible]  .58          .51
  S questions, P answers          [illegible]  .46          .25
  S questions, S answers          .04          .04          .04
  [illegible]                     [illegible]  .17          .09

Topics
  Instant word recognition        .48          [illegible]  .15
  Decoded word recognition        .40          .43          .46
  Oral reading                    .24          .23          .33
  Reading comprehension           .27          .32          .28
  Message comprehension           .22          .13          .21
  Word comprehension              .27          .33          .21
  Attention                       .20          .16          .12
  Cue collection                  .60          .55          .12
  Other                           .62          .54          .77

Topic Sources
  P personal experiences          .55          .17          .04
  S personal experiences          .27          .20          .00
  Dx materials in general         .19          .38          .00
  Dx Rx tables                    .06          .39          .14
  Dx Rx checklists                .00          .07          .30
  Dx Rx glossary                  .00          .00          .05
  Dx Rx decision aid              .01          .45          .21
  Cases in general                .19          .04          .02
  Simulated cases                 .01          .43          .46
  Mini-cases                      [illegible]  .04          .00
  Real cases known by S           .60          .20          .14
  Real cases known by P           .30          .00          .00
  Anecdotes                       .48          .12          .02
  Tests in general                .18          .30          .02
  Standardized tests              .62          .45          .21
  Non-standardized tests          .08          .35          .11
  Textbooks, printed documents    .10          .18          .07
  MORAL                           .38          .04          .04
  Other                           .29          .24          .16

Note: P: preceptor, S: student, Dx: diagnosis, Rx: remediation.

Individual Agreement: Students with Students
Versus Students with Their Preceptors

On the basis of the earlier training studies, we expected that agreement
among students on the pretest would be higher than that obtained in the
observational studies because of the availability of decision aids. Further,
it was expected that agreement would increase as a result of training.

The mean agreement across cases for the total diagnosis is shown in
Figure 1 and in Appendix E. Three different sets of individual diagnostic
agreement data are represented:

1. agreement among the three preceptors, measured at the end of
the training program;

2. agreement of students with their own preceptors; and

3. agreement among students across training groups on a given case.

Student agreements with one another and with their preceptors are shown for
pretests, midtests, posttests, and transfer tests. Agreement among preceptors
and among students reflected the influence of the decision aid. As can be
seen in Figures 1 and 2, preceptor agreement is markedly higher than the mean
value obtained in the series of observational studies (Porter = 0.37 vs. 0.08;
Phi = 0.46 vs. 0.03). For the untrained students, initial diagnostic
agreement was higher than that obtained by the experienced specialists in the
observational studies (Porter = 0.26 vs. 0.08; Phi = 0.34 vs. 0.03). In
addition, the students showed modest gains across the pre-, mid-, and posttests
both for student/student and student/preceptor agreement. On the pretest, the
students agreed more with their preceptors than with one another. After
training, this difference decreased to zero on the posttest. On the transfer
test, individual agreement of student with student was maintained, but the
individual agreement of student with preceptor actually declined to the
pretest level.

Figure 1. Total diagnosis (mean Phi)

To summarize, the overall improvement due to training and decision aids
is impressive compared to that of clinicians working from their traditional
training and without decision aids. The improvement transfers to cases not
previously diagnosed and influences practitioners and experienced clinicians.
Further, the student/student agreement shows sustained improvement from
pretest to transfer test. However, while the student/preceptor agreement shows
improvement from pre- to midtest, a puzzling decline appears from midtest to
posttest to transfer test.

We analyzed the data further to find an explanation for this unexpected
decrease in agreement between students and preceptors. First, we wanted to
determine if the decrease in agreement held for both critical reading
performances and causal factors. The data for critical reading performances
are shown in Figure 2 and Table 5. The data reveal that the effect (i.e., the
higher level of agreement of students with each other than with their
preceptors) held not only for the transfer test but for the posttest as well.
As with total diagnosis, student/student agreement on the pretest was lower
than student/preceptor agreement.

Figure 2. Critical reading performances (mean Phi)

Figure 3 shows the data for causal factors. As can be seen, the profiles
parallel those for total diagnosis, except that there was generally lower
agreement on causal factors than on total diagnosis.

Several hypotheses were suggested to account for the results showing that
student/student agreement consistently outstripped student/preceptor
agreement. The first was that students might have fewer diagnostic categories
than the preceptors, and thus perhaps had higher agreement because they stuck
to the more obvious observations and provided less detail. This hypothesis had
to be rejected. Students did use fewer categories than the preceptors on the
pretest (33 vs. 38), but they actually used substantially more categories on
the posttest (45) and transfer test (50).

A second hypothesis about the students' higher reliability on the
transfer test was that the students' diagnoses might have been simpler than
those of the preceptors (i.e., might have contained more statements about
critical reading performances and fewer about complex matters of causality).

To explore this possibility we identified the diagnostic categories used
by all clinicians or all students. There were 66 such categories, about half
the total number of categories on the checklist. Using this as a base, we
identified (1) those categories agreed upon by all preceptors but not by all
students; and (2) those agreed upon by all students but not by all preceptors.
These categories were identified for pretest, midtest, posttest, and transfer
test. Finally, the categories were divided into those dealing with critical
performances and those dealing with causal factors.

The proportion of diagnostic categories identified as critical
performances for the pretest, posttest, and transfer tests indicates the
relative use of these categories by students and preceptors.



As hypothesized, the diagnostic categories most agreed upon by students
on the pretest were mainly critical reading performances (0.87 for students
vs. 0.41 for preceptors). However, on the posttest and transfer tests the
diagnostic categories most agreed upon by students included fewer critical
reading performances than did the categories most agreed upon by preceptors
(0.27 vs. 0.39 on the posttest and 0.39 vs. 0.47 on the transfer test).
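
The category analysis described above can be sketched with set operations. This is a minimal illustration under stated assumptions: each rater's diagnosis is treated as a set of checklist category labels, "agreed upon by all" is read as set intersection within a group, and the proportion is computed over the categories unanimous in one group but not the other. The exact tallying rule used in the study is not given here, and the labels and data below are invented.

```python
def unanimous(diagnoses):
    """Categories that appear in every diagnosis of a group."""
    return set.intersection(*diagnoses) if diagnoses else set()


def proportion_critical(categories, critical_performances):
    """Share of the given categories that are critical reading performances.

    Returns None when there are no categories to summarize.
    """
    if not categories:
        return None
    return len(categories & critical_performances) / len(categories)


# Hypothetical category labels and diagnoses.
critical = {"instant word recognition", "oral reading"}
preceptor_dx = [
    {"instant word recognition", "visual memory", "amount of practice"},
    {"instant word recognition", "visual memory", "amount of practice", "motivation"},
]
student_dx = [
    {"instant word recognition", "oral reading", "amount of practice"},
    {"instant word recognition", "oral reading", "amount of practice"},
]

all_preceptors = unanimous(preceptor_dx)
all_students = unanimous(student_dx)

preceptors_only = all_preceptors - all_students   # agreed by all preceptors, not all students
students_only = all_students - all_preceptors     # agreed by all students, not all preceptors

students_share = proportion_critical(students_only, critical)
preceptors_share = proportion_critical(preceptors_only, critical)
```

In this invented example the students' distinctive category is a critical performance and the preceptors' is a causal factor, which is the pattern the second hypothesis predicted and the pretest showed.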

A third hypothesis was that the students might have formed cohort groups,
discussed the same cases, and in this way increased their agreement with each
other and lowered their agreement with their preceptors. However, the
agreement statistics were calculated between students from different training
groups. Since the groups met on different days, had a different case order,
and practiced on cases only within their groups, it is highly unlikely that
there was any cross-group practice to account for the increased agreement
exhibited on the post and transfer tests. Thus we are left with the hypothesis
that the students actually had become more systematic in diagnosing cases than
their preceptors.

Commonality Results

Further evidence of the impact of training and decision aids on the
agreement in content of diagnoses is offered by commonality. Commonality is a
measure of group agreement. The diagnostic categories with high commonalities
are those which best characterize the group diagnosis of a given case.
Overall, the mean commonality on the pretest was higher in this study than in
the observational studies (0.54 on the total diagnosis vs. 0.20 for the
observational studies). Commonality also increased from pre- to posttests
(from 0.54 for the total diagnosis to 0.67). These results confirm the
contribution of model-based training and decision aids to group agreement. In
this, as in all prior studies, the critical reading performances figured
prominently in the group diagnosis (see Table 5).
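
Commonality is not defined formally in this section. The sketch below adopts one plausible reading, stated here as an assumption: the commonality of a diagnostic category is the proportion of group members whose diagnosis includes it, so that 1.0 means every member mentioned the category. The function name and the sample diagnoses are hypothetical.

```python
from collections import Counter


def commonality(diagnoses):
    """Proportion of diagnoses in the group that include each category
    (assumed definition, not taken from the study)."""
    group_size = len(diagnoses)
    counts = Counter(category for dx in diagnoses for category in set(dx))
    return {category: n / group_size for category, n in counts.items()}


# Hypothetical group of four diagnoses of one case.
group = [
    {"oral reading inadequate", "visual memory"},
    {"oral reading inadequate"},
    {"oral reading inadequate", "motivation"},
    {"oral reading inadequate", "visual memory"},
]
group_commonality = commonality(group)
```

Under this reading, the categories with values near 1.0 are the ones that characterize the group diagnosis, while low values mark statements made by only a few members.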

As may be seen from the table, most of the commonalities are 1.0, meaning
that there was complete agreement on the critical reading performances seen as
characterizing the case. However, group agreement was not uniformly
distributed over the seven critical reading performances. Decoded word
recognition and listening comprehension had the highest group agreement;
meaning vocabulary and attention/motivation had the lowest. One possible cause
of this is that the simulated cases lack hard data on the latter two factors.
Another possible cause is that these indicators are inherently ill-defined
within the field of reading.

The next analysis concerns the causal factors most frequently mentioned
by preceptors and students across all five cases (Table 6).

Analysis of the table suggests that four major types of causal factors
are most frequently agreed upon across the cases: (1) interactions of
critical reading performances (e.g., instant word recognition and decoded word
recognition as interfering with oral reading proficiency); (2) subskills of
the critical reading performances (e.g., sound-symbol associations for vowels
and segmentation of syllables as causes for decoded word recognition); (3)
overall perceptual problems (e.g., with visual memory and visual
discrimination); and (4) general factors that affect learning (e.g., the
amount of practice, motivation, etc.).

For the students in this training study, at least, the critical reading
performances not only were important as diagnostic categories, but served as
the foundation for examining causes of reading problems.
Summary and Discussion

An earlier series of observational studies (Vinsonhaler et al., 1983)
showed very low individual diagnostic agreement of experienced clinicians with
each other and with themselves on the same simulated case of reading
difficulty. However, the results of three training studies in reading
diagnosis showed that diagnostic reliability (agreement) can be raised from
approximately zero to about 0.66 through improved training. The training
included (1) instruction on a model of the reading process, (2) decision aids
based on the model of process, and (3) practice with feedback on simulated
cases presented on a minicomputer (DEC PDP8).

The first study examined training not based on an explicit model of the
reading process. Instead, training focused on (1) decision aids to make cue
collection and diagnostic reporting a routine and systematic process and (2)
practice with decision aids on simulated cases. Instruction took place within
a diagnostic course for graduate students in reading. Some of the students
were practicing teachers. The results showed a marked increase in agreement
of student diagnoses with critical diagnoses compiled by a group of senior
reading clinicians.

In the second study, training was based on an explicit model of the
reading process emphasizing four critical reading performances. The content
of instruction included the model of process and applications of the model to
diagnosis. The decision aids (including a diagnostic record form and
checklist) were explicitly based on the model of process. The practice with
simulated cases on the minicomputer included instructor feedback based on the
model. Small-group instruction took place within a reading diagnosis course
for graduate students, all of whom were teachers with prior graduate training
in reading. Results were analyzed separately for agreement on critical
reading performances and on causal factors. Individual diagnostic agreement
on the pretest was notably higher in this study than in the series of
observational studies (e.g., a mean Phi of 0.26 vs. 0.03 in the earlier
studies). This pretest difference can probably be attributed to the use of
decision aids since all other testing conditions were identical. Further,
students' individual diagnostic agreement improved from pre- to posttest on
both checklists (e.g., from 0.26 to 0.38 on the data-base checklist),
indicating further improvement as a result of model-based training and
practice.

The final study examined the impact on individual diagnostic agreement of
more refined, extended, and better controlled model-based training than in the
second study. New instructional materials and new diagnostic record forms and
checklists were developed from a model that had seven critical reading
performances. Practice was scheduled on the minicomputer and feedback provided
by preceptors for students' diagnoses of five, rather than four, cases.
Small-group instruction in diagnosis was provided to practicing teachers with
no prior graduate training in reading. Results were analyzed separately,
comparing the diagnosis of student with student, and student with preceptor,
on (1) total diagnosis, (2) critical reading performances, and (3) causal
factors. The results of the third study confirmed the findings of the second
study except that agreements were generally much higher (e.g., pretest
student/student agreement on the critical reading performances was 0.39 and
posttest agreement was 0.66).

We see three major implications of this work. First, diagnosis in
reading can have all the virtues proposed for it in the literature provided its
reliability and validity can be established. Second, for reliability to be
improved, present methods of diagnostic training must be modified to include
the type of training reported here. Third, if the validities of diagnoses are
to be established, empirical studies of remediation based on diagnoses of
known reliability must be performed. The authors are presently conducting
such validity studies. Methodological issues are discussed by Wanner (1982),
and results will be reported in subsequent papers.

Recommendations

Based on our research, we would propose the following method for
implementing model-based training in existing inservice and preservice
programs. First, select a model of the reading process that lends itself to
directing specific diagnostic and remedial actions. The skills-based model
chosen here is but one example. Second, create (1) instructional materials
that teach the model directly and (2) decision aids that help the student
apply the model to diagnostic decision making. Those used in these studies
provide examples of such materials and aids.

Third, provide the means to (1) give practice on simulated cases with
individualized, model-based feedback; and (2) monitor changes in reliability
for pre- and posttesting and possibly for certification testing.

The computer-based method used in the present study worked well. Sets of
programs for presenting simulated cases on small computers have proven both
time and cost effective. Versions of these programs are under development for
low-cost microcomputers (e.g., in BASIC for the Apple II Plus).

Finally, all these resources can be integrated easily into existing 
courses in reading diagnosis or via an additional clinical practicum. 

In summary, our earlier studies uncovered severe problems with diagnostic
reliability in reading. The studies reported here have documented a potential
solution to the reliability problem based on changes in training.
Responsibility for the long-range solution to the problem rests with the
educational community.
Appendix A

Cue Inventory for Case 4, Dan

Physical Information

  Vision test
  Audiometric record

Background Information

  School record
  Teacher form
  School information
  Parent form

Assessment Information

  Basic sight vocabulary (Dolch)
  Sentence completion
  Gates-McKillop reading diagnostic tests
    Recognizing & blending common word parts
    Auditory blending
    Phonic spelling of words
    Giving letter sounds
  Auditory discrimination (Wepman)
  Durrell list-read series
    Intermediate level vocabulary
    Paragraphs

Assessment Information (Cont.)

  Durrell diagnostic analysis of reading difficulty
    Oral Reading
    Silent Reading
    List. Comprehension
    Word Recognition & Word analysis
    Hearing sounds in words - primary
    Visual memory of words - primary
    Intermediate Spelling - List 1
    Phonic spelling of words
  Achievement test (Iowa Test of Basic Skills)
    Vocabulary
    Reading
  Graded word list (Slosson Oral Reading Test)
  Reading achievement (Gates-MacGinitie)
    Speed/Accuracy
  Cognitive ability (Wechsler Intelligence Scale for Children)
    Verbal
    Performance
    Full scale
Appendix B

Calculation of Phi Correlation and Porter Statistic

                                         Clinician i, SIMCASE Q, Form One
                                         Present (+)      Absent (-)      Total

Clinician i,            Present (+)      A                B               a + b
SIMCASE Q, Form Two     Absent (-)       C                D               c + d

                        Total            a + c            b + d

A: Frequency count of statements in the domain present in both sessions, for
   form one and form two of SIMCASE.

B: Frequency count of statements in the domain present in SIMCASE form two
   but not in SIMCASE form one.

C: Frequency count of statements in the domain present in the session for
   SIMCASE form one but not in SIMCASE form two.

D: Frequency count of statements in the domain absent in both sessions, for
   form one and form two of SIMCASE.

$$\text{Phi} = \frac{a \times d - b \times c}{\sqrt{(a + c) \times (b + d) \times (c + d) \times (a + b)}}$$

The presence of a large percentage of statements (more than 85%) in the
"D" cell (the statement is absent in both sessions) artificially inflated the
intercorrelation since it represented, in effect, agreeing to disagree. A
statistic developed by Professor A. Porter (Institute for Research on
Teaching, Michigan State University) was designed to correct for this
occurrence by including in the computation only the values in the A, B, and
C cells:

Porter = [numerator illegible] / (A + B + C)
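
The two statistics can be worked through on a small example. The Phi function below follows the standard formula for a 2 x 2 table. The Porter function is a sketch under an explicit assumption: only the denominator A + B + C is legible in the appendix, so the numerator is taken here to be A (the count of statements present in both sessions), which yields the proportion of mentioned statements that were mentioned both times. The cell counts are invented for illustration.

```python
from math import sqrt


def phi_correlation(a, b, c, d):
    """Phi for a 2 x 2 table with cells A, B, C, D as defined in Appendix B."""
    denominator = sqrt((a + c) * (b + d) * (c + d) * (a + b))
    if denominator == 0:
        return float("nan")  # undefined when any marginal total is zero
    return (a * d - b * c) / denominator


def porter_statistic(a, b, c):
    """Agreement ignoring the D cell.

    ASSUMPTION: the numerator is A; the source shows only the
    denominator A + B + C.
    """
    total = a + b + c
    if total == 0:
        return float("nan")  # no statement was present in either session
    return a / total


# Hypothetical counts for a 100-statement domain, with 90% of the
# statements absent from both sessions (the D cell).
a, b, c, d = 6, 2, 2, 90
example_phi = phi_correlation(a, b, c, d)
example_porter = porter_statistic(a, b, c)
```

With these invented counts Phi is about 0.73 while the assumed Porter value is 0.60, which illustrates how a heavily loaded D cell raises Phi relative to a statistic that ignores joint absences.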
Appendix C

A Portion of a Diagnostic Decision Aid (1980 Study)

Case Name: Dan (Grade 4)

Does the student have a problem with INSTANT WORD RECOGNITION?

(Circle One)    Yes    No

On what basis was this decision made?

    SORT Score: 2.1
    Durrell, Word Analysis and Word Recognition = low first grade

If no, then continue with the next problem area on page 3.

If yes, describe the important factors that have contributed to this problem.
For each factor, suggest remedial procedures for its improvement. Continue on
the next page if required.

1. Describe one factor contributing to the problem with Instant Word
Recognition.

    Dan has poor visual memory of words.

Suggest remedial procedures for alleviating this factor.

    He needs to look at the whole word, not just the beginning letters.

2. Describe another factor contributing to the problem with Instant Word
Recognition.

    Dan does not do enough reading outside of class.

Suggest remedial procedures for alleviating this factor.

    Parents need to devise a plan to encourage Dan to read more, possibly
    using a reward system for the amount of reading he does.



Appendix D

A Portion of the Diagnostic Checklist

Case Name
Your Name
Date

 1  Instant Word Recognition Adequate
 2  Instant Word Recognition Inadequate
 3  Basic Sight Words Adequate
 4  Basic Sight Words Inadequate
 5  Sight Words Learned Via Decoding Adequate
 6  Sight Words Learned Via Decoding Inadequate
 7  Experiential Sight Words Adequate
 8  Experiential Sight Words Inadequate
 9  Visual Discrimination Adequate
10  Visual Discrimination Inadequate
11  Visual Memory Adequate
12  Visual Memory Inadequate
13  Print-Meaning Association Adequate
14  Print-Meaning Association Inadequate
15  Print-Sound Association Adequate
16  Print-Sound Association Inadequate
17  Other Adequate
18  Other Inadequate
19  Decoded Word Recognition Adequate
20  Decoded Word Recognition Inadequate
21  Sound-Symbol Association - Consonants Adequate
22  Sound-Symbol Association - Consonants Inadequate
23  Sound-Symbol Association - Blends/Digraphs Adequate
24  Sound-Symbol Association - Blends/Digraphs Inadequate
25  Sound-Symbol Association - Vowels/Vowel Patterns Adequate
26  Sound-Symbol Association - Vowels/Vowel Patterns Inadequate
27  Visual Segmentation into Syllables Adequate
28  Visual Segmentation into Syllables Inadequate
29  Auditory Segmentation into Syllables Adequate
30  Auditory Segmentation into Syllables Inadequate
31  Blending of Sounds Adequate
32  Blending of Sounds Inadequate
33  Adjustment of Blended Sounds to Language Adequate
34  Adjustment of Blended Sounds to Language Inadequate
35  Use of Root Word Adequate
36  Use of Root Word Inadequate
37  Use of Prefixes Adequate
38  Use of Prefixes Inadequate
39  Use of Suffixes Adequate
40  Use of Suffixes Inadequate
41  Auditory Memory Adequate
42  Auditory Memory Inadequate
43  Auditory Discrimination Adequate
44  Auditory Discrimination Inadequate
45  Visual Memory Adequate
46  Visual Memory Inadequate
47  Visual Discrimination Adequate
48  Visual Discrimination Inadequate
49  Other Adequate
50  Other Inadequate
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Mean Studer 



